convex regression
Convex Regression with a Penalty
A common way to estimate an unknown convex regression function $f_0: ฮฉ\subset \mathbb{R}^d \rightarrow \mathbb{R}$ from a set of $n$ noisy observations is to fit a convex function that minimizes the sum of squared errors. However, this estimator is known for its tendency to overfit near the boundary of $ฮฉ$, posing significant challenges in real-world applications. In this paper, we introduce a new estimator of $f_0$ that avoids this overfitting by minimizing a penalty on the subgradient while enforcing an upper bound $s_n$ on the sum of squared errors. The key advantage of this method is that $s_n$ can be directly estimated from the data. We establish the uniform almost sure consistency of the proposed estimator and its subgradient over $ฮฉ$ as $n \rightarrow \infty$ and derive convergence rates. The effectiveness of our estimator is illustrated through its application to estimating waiting times in a single-server queue.
Identifiable Convex-Concave Regression via Sub-gradient Regularised Least Squares
We propose a novel nonparametric regression method that models complex input-output relationships as the sum of convex and concave components. The method-Identifiable Convex-Concave Nonparametric Least Squares (ICCNLS)-decomposes the target function into additive shape-constrained components, each represented via sub-gradient-constrained affine functions. To address the affine ambiguity inherent in convex-concave decompositions, we introduce global statistical orthogonality constraints, ensuring that residuals are uncorrelated with both intercept and input variables. This enforces decomposition identifiability and improves interpretability. We further incorporate L1, L2 and elastic net regularisation on sub-gradients to enhance generalisation and promote structural sparsity. The proposed method is evaluated on synthetic and real-world datasets, including healthcare pricing data, and demonstrates improved predictive accuracy and model simplicity compared to conventional CNLS and difference-of-convex (DC) regression approaches. Our results show that statistical identifiability, when paired with convex-concave structure and sub-gradient regularisation, yields interpretable models suited for forecasting, benchmarking, and policy evaluation.
Spectrahedral Regression
O'Reilly, Eliza, Chandrasekaran, Venkat
Convex regression is the problem of fitting a convex function to a data set consisting of input-output pairs. We present a new approach to this problem called spectrahedral regression, in which we fit a spectrahedral function to the data, i.e. a function that is the maximum eigenvalue of an affine matrix expression of the input. This method represents a significant generalization of polyhedral (also called max-affine) regression, in which a polyhedral function (a maximum of a fixed number of affine functions) is fit to the data. We prove bounds on how well spectrahedral functions can approximate arbitrary convex functions via statistical risk analysis. We also analyze an alternating minimization algorithm for the non-convex optimization problem of fitting the best spectrahedral function to a given data set. We show that this algorithm converges geometrically with high probability to a small ball around the optimal parameter given a good initialization. Finally, we demonstrate the utility of our approach with experiments on synthetic data sets as well as real data arising in applications such as economics and engineering design.
Sparse Approximate Solutions to Max-Plus Equations with Application to Multivariate Convex Regression
Tsilivis, Nikos, Tsiamis, Anastasios, Maragos, Petros
R { } is equipped with the standard maximum and sum operations, respectively. It has been used to represent various nonlinear processes, in areas such as scheduling and synchronization [2], [6], [9], geometry [22], control theory and optimization [1], [4], morphological image and signal analysis [15], [24], [28], and machine learning [7], [8], [29], [32], [33]. Max-plus algebra is obtained from the conventional linear algebra if we replace addition with maximum and multiplication with addition, as an extension of the max-plus semiring to multiple dimensions. Hence, many of the aforementioned nonlinear processes enjoy some linear-like properties when described in terms of the max-plus algebra. In this paper we are interested in sparse max-plus representations, i.e. vectors which consist of as many uninformative () elements as possible.
Piecewise Linear Regression via a Difference of Convex Functions
Siahkamari, Ali, Gangrade, Aditya, Kulis, Brian, Saligrama, Venkatesh
We present a new piecewise linear regression methodology that utilizes fitting a difference of convex functions (DC functions) to the data. These are functions $f$ that may be represented as the difference $\phi_1 - \phi_2$ for a choice of convex functions $\phi_1, \phi_2$. The method proceeds by estimating piecewise-liner convex functions, in a manner similar to max-affine regression, whose difference approximates the data. The choice of the function is regularised by a new seminorm over the class of DC functions that controls the $\ell_\infty$ Lipschitz constant of the estimate. The resulting methodology can be efficiently implemented via Quadratic programming even in high dimensions, and is shown to have close to minimax statistical risk. We empirically validate the method, showing it to be practically implementable, and to have comparable performance to existing regression/classification methods on real-world datasets.
Convex Regression in Multidimensions: Suboptimality of Least Squares Estimators
Kur, Gil, Gao, Fuchang, Guntuboyina, Adityanand, Sen, Bodhisattva
The least squares estimator (LSE) is shown to be suboptimal in squared error loss in the usual nonparametric regression model with Gaussian errors for $d \geq 5$ for each of the following families of functions: (i) convex functions supported on a polytope (in fixed design), (ii) bounded convex functions supported on a polytope (in random design), and (iii) convex Lipschitz functions supported on any convex domain (in random design). For each of these families, the risk of the LSE is proved to be of the order $n^{-2/d}$ (up to logarithmic factors) while the minimax risk is $n^{-4/(d+4)}$, for $d \ge 5$. In addition, the first rate of convergence results (worst case and adaptive) for the full convex LSE are established for polytopal domains for all $d \geq 1$. Some new metric entropy results for convex functions are also proved which are of independent interest.
Efficient algorithms for multivariate shape-constrained convex regression problems
Lin, Meixia, Sun, Defeng, Toh, Kim-Chuan
Shape-constrained convex regression problem deals with fitting a convex function to the observed data, where additional constraints are imposed, such as component-wise monotonicity and uniform Lipschitz continuity. This paper provides a comprehensive mechanism for computing the least squares estimator of a multivariate shape-constrained convex regression function in $\mathbb{R}^d$. We prove that the least squares estimator is computable via solving a constrained convex quadratic programming (QP) problem with $(n+1)d$ variables and at least $n(n-1)$ linear inequality constraints, where $n$ is the number of data points. For solving the generally very large-scale convex QP, we design two efficient algorithms, one is the symmetric Gauss-Seidel based alternating direction method of multipliers ({\tt sGS-ADMM}), and the other is the proximal augmented Lagrangian method ({\tt pALM}) with the subproblems solved by the semismooth Newton method ({\tt SSN}). Comprehensive numerical experiments, including those in the pricing of basket options and estimation of production functions in economics, demonstrate that both of our proposed algorithms outperform the state-of-the-art algorithm. The {\tt pALM} is more efficient than the {\tt sGS-ADMM} but the latter has the advantage of being simpler to implement.
Nonparametric Shape-restricted Regression
Guntuboyina, Adityanand, Sen, Bodhisattva
We consider the problem of nonparametric regression under shape constraints. The main examples include isotonic regression (with respect to any partial order), unimodal/convex regression, additive shape-restricted regression, and constrained single index model. We review some of the theoretical properties of the least squares estimator (LSE) in these problems, emphasizing on the adaptive nature of the LSE. In particular, we study the risk behavior of the LSE, and its pointwise limiting distribution theory, with special emphasis to isotonic regression. We survey various methods for constructing pointwise confidence intervals around these shape-restricted functions. We also briefly discuss the computation of the LSE and indicate some open research problems and future directions.